What Can Syntax-Based MT Learn from Phrase-Based MT?
Abstract
We compare and contrast the strengths and weaknesses of a syntax-based machine translation model with a phrase-based machine translation model on several levels. We briefly describe each model, highlighting points where they differ. We include a quantitative comparison of the phrase pairs that each model has to work with, as well as the reasons why some phrase pairs are not learned by the syntax-based model. We then evaluate proposed improvements to the syntax-based extraction techniques in light of phrase pairs captured. We also compare the translation accuracy for all variations.
Similar resources
Chained System: A Linear Combination of Different Types of Statistical Machine Translation Systems
The paper explores a way to learn post-editing fixes of raw MT outputs automatically by combining two different types of statistical machine translation (SMT) systems in a linear fashion. Our proposed system (which we call a chained system) consists of two SMT systems: (i) a syntax-based SMT system and (ii) a phrase-based SMT system (Koehn, 2004). We first translate source sentences of the bite...
Combining Translation Memories and Syntax-Based SMT: Experiments with Real Industrial Data
One major drawback of using Translation Memories (TMs) in phrase-based Machine Translation (MT) is that only continuous phrases are considered. In contrast, syntax-based MT allows phrasal discontinuity by learning translation rules containing non-terminals. In this paper, we combine a TM with syntax-based MT via sparse features. These features are extracted during decoding based on translation ...
Hierarchical MT Training using Max-Violation Perceptron
Large-scale discriminative training has become promising for statistical machine translation by leveraging the huge training corpus; for example the recent effort in phrase-based MT (Yu et al., 2013) significantly outperforms mainstream methods that only train on small tuning sets. However, phrase-based MT suffers from limited reorderings, and thus its training can only utilize a small portion ...
Decoding with Syntactic and Non-Syntactic Phrases in a Syntax-Based Machine Translation System
A key concern in building syntax-based machine translation systems is how to improve coverage by incorporating more traditional phrase-based SMT phrase pairs that do not correspond to syntactic constituents. At the same time, it is desirable to include as much syntactic information in the system as possible in order to carry out linguistically motivated reordering, for example. We apply an exte...
Third Workshop on Syntax and Structure in Statistical Translation
A key concern in building syntax-based machine translation systems is how to improve coverage by incorporating more traditional phrase-based SMT phrase pairs that do not correspond to syntactic constituents. At the same time, it is desirable to include as much syntactic information in the system as possible in order to carry out linguistically motivated reordering, for example. We apply an exte...
Publication date: 2007